Abstract. Combinatorial Batch Codes (CBCs), a replication-based variant of the
Batch Codes introduced by Ishai et al. in [IKOS04], abstract the following
data distribution problem: n data items are to be replicated among m servers
in such a way that any k of the n data items can be retrieved by reading
at most one item from each server, with the total amount of storage over the m
servers restricted to N. Given parameters m, c, and k, where c and k are
constants, one of the challenging problems is to construct c-uniform CBCs (CBCs
where each data item is replicated among exactly c servers) which maximize
the value of n. In this work, we present an explicit construction of c-uniform
CBCs with $\Omega(m^{c-1+\frac{1}{k}})$ data items. The construction has the property
that the servers are almost regular, i.e., the number of data items stored in each
server is in the range $[\frac{nc}{m} - \sqrt{\frac{n}{2}\ln(4m)},\ \frac{nc}{m} + \sqrt{\frac{n}{2}\ln(4m)}]$.
The construction is obtained through a better analysis and derandomization of the
randomized construction presented in [IKOS04]. The analysis reveals almost regularity
of the servers, an aspect that so far has not been addressed in the literature. The
derandomization leads to an explicit construction for a wide range of values of c
(for given m and k) where no other explicit construction with similar parameters,
i.e., with $n = \Omega(m^{c-1+\frac{1}{k}})$, is known. Finally, we discuss the possibility
of parallel derandomization of the construction.


1 Introduction 

1.1 Background 

Batch codes. An (n, N, k, m)-batch code (or (n, N, k, m, t = 1)-batch code^1)
abstracts the following data distribution problem: n data items are to be distributed
among m servers in such a way that any k of the n items can be retrieved by reading
at most one item from each server, and the total amount of storage required for this
distribution is bounded by N. Batch codes were introduced in [IKOS04]; their
primary motivation was to amortize the computational work done by the servers
during the execution of a private information retrieval protocol by batching several
queries together while limiting total storage (see [IKOS04] for more details). It is
also easy to see from the above description that these codes have potential
applications in a distributed database scenario where the goal is to distribute load
among the participating servers while optimizing total storage.

On the theoretical side, batch codes closely resemble several combinatorial objects
like expanders, locally decodable codes, etc., and there is also a similarity with Rabin's
information dispersal. However, there are fundamental differences between batch codes
and these objects, especially as far as the setting of parameter values is concerned,
and it seems difficult to set up satisfactory correspondences with these objects. This
dichotomy makes batch codes unique and very interesting objects.

^1 In [IKOS04], batch codes were defined for general t. However, in this work we solely
consider the t = 1 case, as that seems to capture the crux of the problem.
Combinatorial batch codes (CBCs). These are replication-based batch codes;
each of the N stored data items is a copy of one of the n input data items. So, for
CBCs, encoding is assignment (storage) of items to servers and decoding is retrieval
(reading) of items from servers. This requirement makes CBCs purely combinatorial
objects. As combinatorial objects CBCs are quite interesting, and they have received
considerable attention in recent literature ([PSW09, BKMS10, BT11b, BT11a, BT12,
SG14, BRR12, BB14]). Before proceeding further, we introduce the formal framework
for our study of CBCs.

We represent an (n, N, k, m)-CBC C as a bipartite graph $\mathcal{G}_C = (\mathcal{L}, \mathcal{R}, \mathcal{E})$. The set of
left vertices $\mathcal{L}$ represents $|\mathcal{L}| = n$ data items, with vertex $u_i \in \mathcal{L}$ representing data
item $x_i$, $1 \le i \le n$, and the set of right vertices $\mathcal{R}$ represents $|\mathcal{R}| = m$ servers, with
vertex $v_j \in \mathcal{R}$ representing server $s_j$, $1 \le j \le m$. Hence, $(u_i, v_j) \in \mathcal{E}$ is an edge
in $\mathcal{G}_C$ if the data item $x_i$ is stored in server $s_j$. Since the total storage is N, it
follows that $\sum_{u \in \mathcal{L}} \deg(u) = \sum_{v \in \mathcal{R}} \deg(v) = |\mathcal{E}| = N$, where deg(.) is the degree of a
vertex in $\mathcal{G}_C$. Now, it can be observed that any subset $\{x_{i_1}, \ldots, x_{i_k}\}$ of k distinct data
items can be retrieved by reading one item each from k distinct servers iff the sets
$\Gamma(u_{i_1}), \ldots, \Gamma(u_{i_k})$ have a system of distinct representatives (SDR), where $\Gamma(u_r)$,
$r \in \{1, \ldots, n\}$, is the neighbourhood of the vertex $u_r \in \mathcal{L}$, i.e.,
$\Gamma(u_r) = \{v \in \mathcal{R} : (u_r, v) \in \mathcal{E}\}$^2. According to Hall's theorem for SDR (cf. [Bol86], p. 6) this is
equivalent to the condition that the union of any j sets $\Gamma(u_{i_1}), \ldots, \Gamma(u_{i_j})$,
$\{u_{i_1}, \ldots, u_{i_j}\} \subseteq \mathcal{L}$, contains at least j elements for $1 \le j \le k$. These considerations
lead naturally to the following theorem of [PSW09], which can also be thought of as the
definition of a CBC.

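Theorem 1 ([PSW09]). A bipartite graph $\mathcal{G}_C = (\mathcal{L}, \mathcal{R}, \mathcal{E})$ represents an (n, N, k, m)-CBC
if $|\mathcal{L}| = n$, $|\mathcal{R}| = m$, $|\mathcal{E}| = N$, and the union of every collection of j sets
$\Gamma(u_{i_1}), \ldots, \Gamma(u_{i_j})$, $\{u_{i_1}, \ldots, u_{i_j}\} \subseteq \mathcal{L}$, contains at least j elements for $1 \le j \le k$.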
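
On small instances the Hall-type condition of Theorem 1 can be checked by brute force. The following Python sketch is illustrative only: the function name `is_cbc` and the representation of a CBC as a list of neighbourhood sets are assumptions made for this example and do not come from [PSW09].

```python
from itertools import combinations

def is_cbc(neighbourhoods, k):
    """Return True if every j <= k data items have at least j servers
    in the union of their neighbourhoods (the condition of Theorem 1).

    neighbourhoods[i] is the set of servers storing data item i.
    Brute force: the running time grows like n^k, so use toy inputs only.
    """
    n = len(neighbourhoods)
    for j in range(1, min(k, n) + 1):
        for items in combinations(range(n), j):
            servers = set().union(*(neighbourhoods[i] for i in items))
            if len(servers) < j:
                return False
    return True

# Three items, each replicated on two of three servers (2-uniform).
good = [{0, 1}, {1, 2}, {0, 2}]
# Three items crowded onto the same two servers: no SDR for k = 3.
bad = [{0, 1}, {0, 1}, {0, 1}]
print(is_cbc(good, 3), is_cbc(bad, 3))
```

The second example fails only for j = 3, so the same placement is still a valid CBC for k = 2.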

From now on, we will identify the graph $\mathcal{G}_C = (\mathcal{L}, \mathcal{R}, \mathcal{E})$ with an (n, N, k, m)-CBC^3
and omit the subscript C as it will not cause any confusion. A CBC $\mathcal{G} = (\mathcal{L}, \mathcal{R}, \mathcal{E})$
is called c-uniform if for each $u \in \mathcal{L}$, $\deg(u) = c$, and it is called l-regular if for
each $v \in \mathcal{R}$, $\deg(v) = l$. Two optimization problems related to CBCs have been
addressed in the literature: (i) given n, m, k, find the minimum value of N attained by
a CBC (not necessarily uniform or regular), and provide an explicit construction of the
corresponding extremal CBCs; (ii) given m, c, k, find the maximum value of n, denoted
as n(m, c, k), attained by a c-uniform CBC (not necessarily regular), and provide an
explicit construction of the corresponding extremal CBCs. In this work we will consider
the latter problem in the setting of parameters where c and k are constants while m is
variable.

At this point, it may be observed that though the definition and the considered
problem bear some similarity to those of bipartite expanders, especially the
unbalanced expanders [GUV09], there are important differences between these two cases
as well. On the similarity side, both are bipartite graphs with constant left-degree;
in both cases, it is required that every subset of vertices of $\mathcal{L}$ up to a specified
size should have a neighbourhood of specified size, and it is desirable that $|\mathcal{L}| \gg |\mathcal{R}|$.
However, in the case of unbalanced expanders, the goal is to stretch the expansion
of subsets^4 (of specified sizes) of $\mathcal{L}$ as close to the left-degree as possible, whereas
in the case of CBCs, an expansion of 1 is sufficient and it is more important to make $|\mathcal{L}|$ as
large as possible with respect to $|\mathcal{R}|$. Also important is the fact that the parameter

^2 In the sequel, we will require an extension of the definition of the neighbourhood of a
vertex to the neighbourhood of a subset. More formally, given $S \subseteq \mathcal{L}$, we denote by
$\Gamma(S)$ the set $\{v \in \mathcal{R} \mid \exists u \in S, (u, v) \in \mathcal{E}\}$.

^3 Here the terminology is in keeping with the representation of a CBC as a set-system in
some of the previous works, where the set $\mathcal{R}$ is treated as the ground set and the multi-set
$\{\Gamma(u_1), \ldots, \Gamma(u_n)\}$, $u_i \in \mathcal{L}$, is the collection of subsets.

^4 For a set S, the expansion of S is $|\Gamma(S)|/|S|$.

k is a constant in the case of CBCs (within our setting of parameters), whereas for
expanders k varies with n. These differences make the (desirable) parameters in these
two cases essentially unrelated. So, it seems unlikely that the existing constructions
of unbalanced expanders can be immediately used for the construction of CBCs where c
and k are constants.

Before discussing existing results and our contribution we briefly mention the 
notion of ‘explicit’ construction of a combinatorial object in general and CBCs in 
particular as that will be crucial to our discussion and result. 

Explicit construction. Construction of a combinatorial object with desirable
properties is the computation of a representation of the object and is tied to the
resources used for the computation. In the literature, those constructions which require
a practically feasible amount of resources, such as polynomial time or logarithmic space,
are termed explicit. This can be contrasted with exhaustive search for a combinatorial
object whose existence has been proven (e.g. by a probabilistic argument); the search
is done in the space of the object and requires an infeasible amount of resources (e.g.
exponential time). The notion of explicitness we will adhere to in this work is
polynomial time constructibility, which requires that the time required for the
construction be bounded by a polynomial in the size of the representation. Explicitness
is further classified as follows^5.

Globally explicit. In this case the whole object is constructed in time polynomial
in the size of the object. For example, a globally explicit construction of a CBC
$(\mathcal{L}, \mathcal{R}, \mathcal{E})$ would list all the members of $\mathcal{E}$ in $poly(|\mathcal{E}|)$ time.

Locally explicit. In this case the idea is to have quick local access to the object. More
formally, for a desirable combinatorial object G, a locally explicit construction of
G is an algorithm which, given an index of size log(|G|), outputs the member of
G with the given index (or does some local computation on the member) in time
polylog(|G|). This is a more specialized notion and depends on the context. For
example, a locally explicit construction of a c-uniform CBC $(\mathcal{L}, \mathcal{R}, \mathcal{E})$ would list
the neighbourhood of a vertex $v \in \mathcal{L}$ in time $poly(\log(|\mathcal{L}|), c)$ given the index of v
(which is of size $\log(|\mathcal{L}|)$). It can be seen that locally explicit construction is a stronger
notion than globally explicit construction and is always desirable as it is useful for
algorithmic applications. In fact, the common notion of construction of combinatorial
objects (e.g. using various algebraic structures) falls in this category.

Now, we state relevant existing results for CBCs and subsequently discuss our
contribution.


Existing Results 

- In [IKOS04], the authors have shown, inter alia, that $n(m, c, k) = \Omega(m^{c-1})$; this
bound was obtained using the probabilistic method.

- In [PSW09], the authors have refined the above estimate using the method of
deletion (another probabilistic technique, see [AS92]) to $n(m, c, k) = \Omega(m^{\frac{ck}{k-1}-1})$.
They have also shown through explicit construction that
$n(m, k-1, k) = (k-1)\binom{m}{k-1}$ and $n(m, k-2, k) = \binom{m}{k-2}$.

- In [BB14], it was shown that $n(m, c, k) = O(m^{[\ldots]})$ for $7 \le k$ and
$3 \le c \le k - \lceil \log k \rceil - 1$, and for $k - \lceil \log k \rceil \le c \le k - 1$ it was shown through explicit
construction that $n(m, c, k) = \Theta(m^{c})$. For the c = 2 case, the lower bound of (ii) was
improved (through explicit construction) to $n(m, 2, k) = \Omega(m^{[\ldots]})$ for all $k \ge 8$
and infinitely many values of n.

- In [BT12], the authors improved the general upper bound to show that
$n(m, c, k) = O(m^{[\ldots]})$ for $c \le [\ldots] - 1$.

All the constructions mentioned above are locally explicit. 

^5 Though the classification is with respect to polynomial time constructibility, it is
applicable to other feasible resource bounds as well.
Our Contribution

We address the question of explicit construction of uniform and
regular CBCs. Here it is noteworthy that the aspects of regularity and uniformity
have not been addressed together so far in the literature; addressing both
properties together is interesting and important from a theoretical as well as a
practical point of view. We provide a globally explicit construction which is uniform
and almost regular. In particular, our result is the following.

Theorem 2. Let c, k be positive constants. Then for all sufficiently large m, there
exists a c-uniform (n, cn, k, m)-CBC, where $n = \Omega(m^{c-1+\frac{1}{k}})$ and the number of items in
each server is in the range $[\frac{nc}{m} - \sqrt{\frac{n}{2}\ln(4m)},\ \frac{nc}{m} + \sqrt{\frac{n}{2}\ln(4m)}]$. Moreover, the CBC
can be globally constructed in poly(m) time.
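
To get a feel for how tight the "almost regular" guarantee of Theorem 2 is, the load interval can be evaluated numerically. The sketch below is only an illustration: the helper name `degree_range` and the sample values of n, c, m are assumptions chosen for the example, not parameters taken from the paper.

```python
from math import log, sqrt

def degree_range(n, c, m):
    """Interval allowed for each server's load in Theorem 2:
    nc/m -/+ sqrt((n/2) * ln(4m))."""
    mean = n * c / m
    dev = sqrt(n / 2 * log(4 * m))
    return mean - dev, mean + dev

# Illustrative values only: a million items, 3 copies each, 100 servers.
lo, hi = degree_range(n=10**6, c=3, m=100)
print(lo, hi)  # an interval centred on nc/m = 30000
```

Because the deviation grows like $\sqrt{n \ln m}$ while the mean grows like $n/m$, the relative spread shrinks as n grows for fixed m.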

We use the construction of [IKOS04], derandomize it using the method of
conditional expectation (see [AS92]), and analyze it in greater detail. The analysis shows
almost regularity of the construction and also improves the exponent of the lower
bound ($c - 1 + \frac{1}{k}$ as opposed to $c - 1$ of [IKOS04]). Though the improved
exponent is inferior to the one obtained in [PSW09] ($\frac{ck}{k-1} - 1$), the importance
of the construction lies in its almost regularity and explicitness, which are not known
for the construction of [PSW09]. In fact, apart from the range $k - \lceil \log k \rceil \le c < k$,
where $n = \Theta(m^{c})$ is achieved (see [PSW09] for c = k - 1 and k - 2 and [BB14]
for the remaining values), and the 2-uniform case ([BB14]), there is no explicit
construction in the literature. So, our construction serves to fill the void to some
extent.

To describe our construction, we give an algorithm which, for given positive
integers k, c, and sufficiently large m, runs in time poly(m) and outputs the edges of a
bipartite graph $(\mathcal{L}, \mathcal{R}, \mathcal{E})$, with $|\mathcal{R}| = m$, $|\mathcal{L}| = n = \Omega(m^{c-1+\frac{1}{k}})$, satisfying the following
conditions.

(a) Each left vertex in $\mathcal{L}$ has degree c and each vertex in $\mathcal{R}$ has degree in the range
$[\frac{nc}{m} - \sqrt{\frac{n}{2}\ln(4m)},\ \frac{nc}{m} + \sqrt{\frac{n}{2}\ln(4m)}]$.

(b) Any subset of i, $1 \le i \le k$, vertices in $\mathcal{L}$ has at least i neighbours in $\mathcal{R}$.

Note that there is a trivial non-explicit algorithm to construct the required graph
which, given the input parameters, runs in time exponential in m; the algorithm
searches the space of all possible bipartite graphs $(\mathcal{L}, \mathcal{R}, \mathcal{E})$ with $|\mathcal{R}| = m$,
$|\mathcal{L}| = n = \Omega(m^{c-1+\frac{1}{k}})$, and outputs one that satisfies the above two conditions.

In the proof of Theorem 2, we will need the following version of Hoeffding's
inequality.

Theorem 3. (Hoeffding's inequality [Hoe65]) Let $X_1, X_2, \ldots, X_n$ be independent
random variables taking their values in the interval [0, 1]. Let $X = \sum_{i=1}^{n} X_i$. Then for
every real number $a > 0$,

$$\Pr\{|X - E[X]| > a\} \le 2e^{-\frac{2a^2}{n}}.$$

Also, given a set S and a positive integer c ($\le |S|$), we will denote by $\binom{S}{c}$ the set of
all c-element subsets of S.
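
As a quick sanity check of how Theorem 3 interacts with the deviation $\sqrt{\frac{n}{2}\ln(4m)}$ appearing in Theorem 2, the right-hand side of Hoeffding's inequality can be evaluated directly. This is a small numerical sketch; the function name `hoeffding_tail` and the sample values of n and m are assumptions for illustration.

```python
from math import exp, log, sqrt

def hoeffding_tail(n, a):
    """Right-hand side of Theorem 3: 2 * exp(-2 a^2 / n)."""
    return 2 * exp(-2 * a * a / n)

n, m = 5000, 50                      # illustrative values only
a = sqrt(n / 2 * log(4 * m))         # deviation allowed in Theorem 2
per_server = hoeffding_tail(n, a)    # 2 * exp(-ln(4m)) = 1 / (2m)
union_bound = m * per_server         # bound over all m servers
print(per_server, union_bound)
```

With this choice of a the exponent is exactly $-\ln(4m)$, so the per-server bound is $\frac{1}{2m}$ and the union bound over m servers is $\frac{1}{2}$, independently of n.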

2 Proof of Theorem 2 

The proof of Theorem 2 is split into two parts. In the first part, we give a probabilistic
proof of existence (which essentially is also a randomized algorithm) of the CBC. In
the second part, we derandomize the construction using the method of conditional
expectation. This is a commonly used technique having its genesis in [ES73] and was
later applied to prove many other derandomization results (e.g. [Rag88, Spe94]).
Informally, the method systematically performs a binary (or more commonly a d-ary)
search on the sample space from which the randomized algorithm makes its choices
and finally finds a good point.

Proof of existence. We construct a bipartite graph $\mathcal{G} = (\mathcal{L}, \mathcal{R}, \mathcal{E})$, where $\mathcal{L}$ is the
set $\{u_1, \ldots, u_n\}$ of n left vertices, $\mathcal{R}$ is the set $\{v_1, \ldots, v_m\}$ of m right vertices, and
$\mathcal{E}$ is the set of edges, in the following manner. For each vertex in $\mathcal{L}$ we choose its c
distinct neighbours by picking randomly, uniformly, and independently a subset of
c vertices from $\mathcal{R}$, i.e., its neighbourhood is an independently and uniformly chosen
element of $\binom{\mathcal{R}}{c}$. So, for $u \in \mathcal{L}$, $S' \subseteq \mathcal{R}$,

$$\Pr\{\Gamma(u) \subseteq S'\} = \frac{\binom{|S'|}{c}}{\binom{m}{c}} \le \left(\frac{|S'|}{m}\right)^{c}.$$

Next, for a subset $S \subseteq \mathcal{L}$, $|S| = i$, $c + 1 \le i \le k$, and a subset $S' \subseteq \mathcal{R}$, $|S'| = i - 1$, we
say that the event $Bad_{S,S'}$ has occurred if $\Gamma(S) \subseteq S'$. So, we have

$$\Pr\{Bad_{S,S'}\} \le \left(\frac{i-1}{m}\right)^{ci} \qquad (1)$$

using independence of the events $\Gamma(u) \subseteq S'$ for $u \in \mathcal{L}$. Now, our goal is to bound
the probability of occurrence of any $Bad_{S,S'}$, $S \subseteq \mathcal{L}$, $S' \subseteq \mathcal{R}$, and $c + 1 \le i \le k$. To
this end, we have

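$$\sum_{c+1 \le i \le k}\ \sum_{\substack{S \subseteq \mathcal{L},\\ |S| = i}}\ \sum_{\substack{S' \subseteq \mathcal{R},\\ |S'| = i-1}} \Pr\{Bad_{S,S'}\}
\le \sum_{c+1 \le i \le k} \binom{n}{i}\binom{m}{i-1}\left(\frac{i-1}{m}\right)^{ci}
\le \cdots \le \frac{1}{4} \qquad (2)$$

since $n = \Omega(m^{c-1+\frac{1}{k}})$ is chosen with a suitably small constant factor.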


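Next, for $u \in \mathcal{L}$, $v \in \mathcal{R}$ define the indicator random variable $X_v^u$ such that

$$X_v^u = \begin{cases} 1 & \text{if } (u, v) \in \mathcal{E} \\ 0 & \text{otherwise,} \end{cases}$$

and let $X_v$, $v \in \mathcal{R}$, be a random variable denoting the degree of vertex v. Clearly
$X_v = \sum_{u \in \mathcal{L}} X_v^u$. Now, $\Pr\{X_v^u = 1\} = \frac{c}{m}$. So, by linearity of expectation,

$$E[X_v] = E\Big[\sum_{u \in \mathcal{L}} X_v^u\Big] = \sum_{u \in \mathcal{L}} E[X_v^u] = \frac{nc}{m}.$$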

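From the fact that the neighbourhoods of vertices $u \in \mathcal{L}$ are chosen independently, it
follows that the variables $X_v^u$ for $u \in \mathcal{L}$, and a fixed $v \in \mathcal{R}$, are mutually independent.
So, applying Theorem 3 with $a = \sqrt{\frac{n}{2}\ln(4m)}$ we have

$$\Pr\Big\{|X_v - E[X_v]| > \sqrt{\tfrac{n}{2}\ln(4m)}\Big\} \le 2(4m)^{-1}. \qquad (3)$$

So, by the union bound, the probability that the event $|X_v - E[X_v]| > \sqrt{\frac{n}{2}\ln(4m)}$ occurs
for some $v \in \mathcal{R}$ is bounded by

$$\sum_{v \in \mathcal{R}} \Pr\Big\{|X_v - E[X_v]| > \sqrt{\tfrac{n}{2}\ln(4m)}\Big\} \le \frac{1}{2}. \qquad (4)$$

Hence, from equations (2) and (4), with probability at least $1 - (\frac{1}{4} + \frac{1}{2}) = \frac{1}{4}$ none
of the above events occur. □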
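
The existence proof doubles as a randomized algorithm: sample every neighbourhood independently, check the two conditions, and retry on failure. The sketch below illustrates that loop on toy parameters. The function names, the retry cap `max_tries`, and the fixed seed are assumptions of this example; the brute-force check of condition (b) is only practical for very small n and k.

```python
import random
from itertools import combinations
from math import log, sqrt

def sample_graph(n, m, c, rng):
    """Each data item picks a uniformly random c-subset of the m servers."""
    return [frozenset(rng.sample(range(m), c)) for _ in range(n)]

def satisfies_conditions(nbrs, m, c, k):
    n = len(nbrs)
    dev = sqrt(n / 2 * log(4 * m))
    degrees = [sum(v in nb for nb in nbrs) for v in range(m)]
    if any(abs(d - n * c / m) > dev for d in degrees):   # condition (a)
        return False
    # Condition (b): sets of at most c items already see >= c servers,
    # so only sizes c+1, ..., k need to be examined.
    for i in range(c + 1, min(k, n) + 1):
        for S in combinations(nbrs, i):
            if len(frozenset().union(*S)) < i:
                return False
    return True

def random_cbc(n, m, c, k, seed=0, max_tries=1000):
    rng = random.Random(seed)
    for _ in range(max_tries):
        nbrs = sample_graph(n, m, c, rng)
        if satisfies_conditions(nbrs, m, c, k):
            return nbrs
    raise RuntimeError("no valid graph found; n may be too large for m")

print(sorted(sorted(nb) for nb in random_cbc(n=8, m=10, c=2, k=3)))
```

When n is within the bound of Theorem 2, each attempt succeeds with probability at least 1/4, so the expected number of attempts is at most 4.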


Derandomization. Before presenting the algorithm we derive expressions for the
expected number of $Bad_{S,S'}$ events and the expected number of vertices $v \in \mathcal{R}$ for which
$|\deg(v) - \frac{nc}{m}| > \sqrt{\frac{n}{2}\ln(4m)}$, conditional on fixed choices of $\Gamma(u_1), \ldots, \Gamma(u_t)$. Then
we show that if at the t-th stage, $1 \le t \le n$ (having fixed $\Gamma(u_1), \ldots, \Gamma(u_{t-1})$), the choice of
$\Gamma(u_t)$ is made in such a way as to minimize the sum of these two expectations, then in the
final graph, which is no longer random since all the neighbourhoods are fixed, there
are no events $Bad_{S,S'}$ and no vertices $v \in \mathcal{R}$ for which $|\deg(v) - \frac{nc}{m}| > \sqrt{\frac{n}{2}\ln(4m)}$,
i.e., no violations of conditions (a) and (b). So, the derandomization proceeds in n
stages; at the beginning of stage t, the neighbourhoods of vertices $u_1, \ldots, u_{t-1} \in \mathcal{L}$ are fixed,
and the neighbourhood of $u_t$, $\Gamma(u_t) \in \binom{\mathcal{R}}{c}$, is fixed in such a way that minimizes the
expected number of violations of conditions (a) and (b). The algorithm (Algorithm 1)
is immediate from these observations.


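First, we introduce indicator random variables $Y_{S,S'}$ corresponding to each event
$Bad_{S,S'}$, i.e.,

$$Y_{S,S'} = \begin{cases} 1 & \text{if } \Gamma(S) \subseteq S' \\ 0 & \text{otherwise.} \end{cases}$$

Also, we define $Y = \sum_{c+1 \le i \le k} \sum_{S \subseteq \mathcal{L}, |S| = i} \sum_{S' \subseteq \mathcal{R}, |S'| = i-1} Y_{S,S'}$. By linearity of expectation
we have

$$E[Y] = E\Big[\sum_{c+1 \le i \le k}\ \sum_{\substack{S \subseteq \mathcal{L},\\ |S| = i}}\ \sum_{\substack{S' \subseteq \mathcal{R},\\ |S'| = i-1}} Y_{S,S'}\Big]
= \sum_{c+1 \le i \le k}\ \sum_{\substack{S \subseteq \mathcal{L},\\ |S| = i}}\ \sum_{\substack{S' \subseteq \mathcal{R},\\ |S'| = i-1}} E[Y_{S,S'}]
= \sum_{c+1 \le i \le k}\ \sum_{\substack{S \subseteq \mathcal{L},\\ |S| = i}}\ \sum_{\substack{S' \subseteq \mathcal{R},\\ |S'| = i-1}} \Pr\{Y_{S,S'} = 1\} \le \frac{1}{4} \quad \text{from (2).}$$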

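Let $\{u_1, u_2, \ldots, u_t\} \subseteq \mathcal{L}$, and let $C_1, C_2, \ldots, C_t \in \binom{\mathcal{R}}{c}$ be fixed subsets such that
$\Gamma(u_j) = C_j$, $1 \le j \le t$, and for the remaining vertices in $\mathcal{L}$, their neighbourhoods are chosen
independently and uniformly at random from $\binom{\mathcal{R}}{c}$. Let $S \subseteq \mathcal{L}$, $S' \subseteq \mathcal{R}$, $|S| = i$,
$|S'| = i - 1$ be fixed subsets for some i, $c + 1 \le i \le k$; also let $W = S \cap \{u_1, u_2, \ldots, u_t\}$,
$|W| = w$, and $\Gamma(W) = \emptyset$ for $W = \emptyset$. Then we have

$$E[Y_{S,S'} \mid \Gamma(u_1) = C_1, \ldots, \Gamma(u_t) = C_t] = \Pr\{\Gamma(S) \subseteq S' \mid \Gamma(u_1) = C_1, \ldots, \Gamma(u_t) = C_t\}
= \begin{cases} 0 & \text{if } \Gamma(W) \not\subseteq S' \\[4pt] \left(\dfrac{\binom{i-1}{c}}{\binom{m}{c}}\right)^{i-w} & \text{otherwise.} \end{cases} \qquad (5)$$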
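
The conditional probability in (5) is simple to evaluate: it is zero as soon as some already fixed neighbourhood sticks out of S', and otherwise each of the i - w unfixed vertices of S independently lands inside S' with probability $\binom{i-1}{c}/\binom{m}{c}$. A minimal Python sketch, in which the function name `cond_prob_bad` and the dictionary representation of the fixed neighbourhoods are assumptions of this example:

```python
from math import comb

def cond_prob_bad(S, S_prime, fixed, m, c):
    """Pr{Gamma(S) is contained in S' | neighbourhoods in `fixed`}.

    S       : iterable of left-vertex labels, |S| = i
    S_prime : set of right-vertex labels, |S'| = i - 1
    fixed   : dict mapping already processed left vertices to their c-subsets
    """
    S = list(S)
    W = [u for u in S if u in fixed]
    if any(not fixed[u] <= S_prime for u in W):
        return 0.0                            # a fixed neighbourhood escapes S'
    p = comb(len(S_prime), c) / comb(m, c)    # chance for one unfixed vertex
    return p ** (len(S) - len(W))

# m = 5 servers, c = 2, S = {u0, u1, u2}, S' = {v0, v1}.
print(cond_prob_bad((0, 1, 2), {0, 1}, {0: {0, 1}}, m=5, c=2))
print(cond_prob_bad((0, 1, 2), {0, 1}, {0: {0, 2}}, m=5, c=2))
```

In the first call one vertex is fixed inside S', leaving two random vertices with probability 1/10 each; in the second call the fixed neighbourhood already leaves S'.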

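So, by applying linearity of expectation and from the above,

$$E[Y \mid \Gamma(u_1) = C_1, \ldots, \Gamma(u_t) = C_t]
= \sum_{c+1 \le i \le k}\ \sum_{\substack{S \subseteq \mathcal{L},\\ |S| = i}}\ \sum_{\substack{S' \subseteq \mathcal{R},\\ |S'| = i-1}} E[Y_{S,S'} \mid \Gamma(u_1) = C_1, \ldots, \Gamma(u_t) = C_t]$$

$$= \sum_{c+1 \le i \le k}\ \sum_{\substack{S \subseteq \mathcal{L},\\ |S| = i}}\ \sum_{\substack{S' \subseteq \mathcal{R},\\ |S'| = i-1}} \Pr\{\Gamma(S) \subseteq S' \mid \Gamma(u_1) = C_1, \ldots, \Gamma(u_t) = C_t\}$$

$$= \sum_{c+1 \le i \le k}\ \sum_{\substack{S \subseteq \mathcal{L},\\ |S| = i}}\ \sum_{\substack{S' \subseteq \mathcal{R},\ |S'| = i-1,\\ \Gamma(W) \subseteq S'}} \left(\frac{\binom{i-1}{c}}{\binom{m}{c}}\right)^{i-w}, \qquad (6)$$

where $W = S \cap \{u_1, \ldots, u_t\}$ and $w = |W|$ depend on S.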

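Next, corresponding to each vertex $v \in \mathcal{R}$ we introduce an indicator random variable
$Z_v$ such that

$$Z_v = \begin{cases} 1 & \text{if } |\deg(v) - \frac{nc}{m}| > \sqrt{\frac{n}{2}\ln(4m)} \\ 0 & \text{otherwise,} \end{cases}$$

and define $Z = \sum_{v \in \mathcal{R}} Z_v$. So, by linearity of expectation we have

$$E[Z] = E\Big[\sum_{v \in \mathcal{R}} Z_v\Big] = \sum_{v \in \mathcal{R}} E[Z_v] = \sum_{v \in \mathcal{R}} \Pr\{Z_v = 1\} \le \frac{1}{2} \quad \text{from (4).}$$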

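Like in the previous case, our goal is to estimate $E[Z \mid \Gamma(u_1) = C_1, \ldots, \Gamma(u_t) = C_t]$ by
estimating $E[Z_v \mid \Gamma(u_1) = C_1, \ldots, \Gamma(u_t) = C_t]$ for each $v \in \mathcal{R}$. For a fixed $v \in \mathcal{R}$, let
$l = |\{u_i \mid v \in \Gamma(u_i), 1 \le i \le t\}|$ be the number of already fixed neighbours of v. Let
$\alpha = \frac{nc}{m} - \sqrt{\frac{n}{2}\ln(4m)}$ and $\beta = \frac{nc}{m} + \sqrt{\frac{n}{2}\ln(4m)}$.
Then, with deg(v) in the event below counting only the neighbours of v among the $n - t$
vertices whose neighbourhoods are not yet fixed (each of which is adjacent to v
independently with probability $\frac{c}{m}$), we have

$$E[Z \mid \Gamma(u_1) = C_1, \ldots, \Gamma(u_t) = C_t] = \sum_{v \in \mathcal{R}} E[Z_v \mid \Gamma(u_1) = C_1, \ldots, \Gamma(u_t) = C_t]$$

$$= \sum_{v \in \mathcal{R}} \Pr\{\deg(v) < \alpha - l \text{ or } \deg(v) > \beta - l \mid \Gamma(u_1) = C_1, \ldots, \Gamma(u_t) = C_t\}$$

$$= \sum_{v \in \mathcal{R}} \left( \sum_{i=0}^{\alpha - l - 1} \binom{n-t}{i} \left(\frac{c}{m}\right)^{i} \left(1 - \frac{c}{m}\right)^{n-t-i}
+ \sum_{i=\beta - l + 1}^{n-t} \binom{n-t}{i} \left(\frac{c}{m}\right)^{i} \left(1 - \frac{c}{m}\right)^{n-t-i} \right).$$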
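
For a single server, the conditional expectation above is a two-sided binomial tail: the final load is the number l of already fixed neighbours plus a Binomial(n - t, c/m) contribution from the vertices that are still random. The following sketch evaluates that tail directly; the function name and argument names are assumptions of this example, and the test on `l + i` is an equivalent way of writing the two summation ranges.

```python
from math import comb

def cond_prob_degree_violation(l, remaining, p, alpha, beta):
    """Probability that l + Bin(remaining, p) falls outside [alpha, beta].

    l         : neighbours of server v among the already fixed data items
    remaining : number of data items that are still random (n - t)
    p         : c / m, the chance that an unfixed item is stored on v
    """
    total = 0.0
    for i in range(remaining + 1):
        if l + i < alpha or l + i > beta:
            total += comb(remaining, i) * p**i * (1 - p)**(remaining - i)
    return total

# Three fair coin flips, load must stay within [1, 2]: only 0 or 3 heads fail.
print(cond_prob_degree_violation(l=0, remaining=3, p=0.5, alpha=1, beta=2))
```

Summing this quantity over all servers gives the conditional expectation of Z used by the derandomization.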

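Finally, we show that if at the j-th iteration (having fixed $\Gamma(u_1) = C_1, \ldots, \Gamma(u_{j-1}) = C_{j-1}$
at the beginning) $\Gamma(u_j) = C_j$ is chosen so as to minimize
$E[Y + Z \mid \Gamma(u_1) = C_1, \ldots, \Gamma(u_j) = C]$, $C \in \binom{\mathcal{R}}{c}$, then in the final graph, which is no longer random,
conditions (a) and (b) are met. To this end, we first observe that

$$E[Y + Z \mid \Gamma(u_1) = C_1, \ldots, \Gamma(u_{t-1}) = C_{t-1}]
= \sum_{C \in \binom{\mathcal{R}}{c}} \frac{E[Y + Z \mid \Gamma(u_1) = C_1, \ldots, \Gamma(u_t) = C]}{\binom{m}{c}}
\ge \min_{C \in \binom{\mathcal{R}}{c}} E[Y + Z \mid \Gamma(u_1) = C_1, \ldots, \Gamma(u_t) = C]. \qquad (7)$$

Hence, it follows that

$$\min_{C_n \in \binom{\mathcal{R}}{c}} E[Y + Z \mid \Gamma(u_1) = C_1, \ldots, \Gamma(u_n) = C_n]
\le \min_{C_{n-1} \in \binom{\mathcal{R}}{c}} E[Y + Z \mid \Gamma(u_1) = C_1, \ldots, \Gamma(u_{n-1}) = C_{n-1}]
\le \cdots \le \min_{C_1 \in \binom{\mathcal{R}}{c}} E[Y + Z \mid \Gamma(u_1) = C_1] \le E[Y + Z] \le \frac{3}{4}. \qquad (8)$$

Since Y and Z are non-negative integer valued random variables, the above essentially means
that at the end, when all the neighbourhoods $\Gamma(u_1), \ldots, \Gamma(u_n)$ are fixed, Y = 0 and
Z = 0. So, both the conditions (a) and (b) are met. Now, we have the following
straightforward algorithm to construct the bipartite graph.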

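Input: Positive constants c, k, and sufficiently large m.

Output: A bipartite graph $(\mathcal{L}, \mathcal{R}, \mathcal{E})$, where
$\mathcal{L} = \{u_1, u_2, \ldots, u_n\}$ ($n = \Omega(m^{c-1+\frac{1}{k}})$) and $\mathcal{R} = \{v_1, v_2, \ldots, v_m\}$, such
that $\Gamma(u_j) = C_j \in \binom{\mathcal{R}}{c}$, $1 \le j \le n$, meeting conditions (a) and (b).

$\alpha = \frac{nc}{m} - \sqrt{\frac{n}{2}\ln(4m)}$, and $\beta = \frac{nc}{m} + \sqrt{\frac{n}{2}\ln(4m)}$;

for j ← 1 to n do
    $U_{j-1} = \{u_1, u_2, \ldots, u_{j-1}\}$, min ← 1
    for $C \in \binom{\mathcal{R}}{c}$ do
        $Y' \leftarrow \sum_{c+1 \le i \le k}\ \sum_{u_j \in S \subseteq \mathcal{L},\ |S| = i}\ \sum_{S' \subseteq \mathcal{R},\ |S'| = i-1,\ \Gamma(U_{j-1} \cap S) \cup C \subseteq S'} \left( \binom{i-1}{c} / \binom{m}{c} \right)^{i - |U_{j-1} \cap S| - 1}$
        $Z \leftarrow \sum_{v \in \mathcal{R}} \Big( \sum_{i=0}^{\alpha - |U_{j-1} \cap \Gamma(v)| - |\{v\} \cap C| - 1} \binom{n-j}{i} (\frac{c}{m})^{i} (1 - \frac{c}{m})^{n-j-i}$
              $+ \sum_{i = \beta - |U_{j-1} \cap \Gamma(v)| - |\{v\} \cap C| + 1}^{n-j} \binom{n-j}{i} (\frac{c}{m})^{i} (1 - \frac{c}{m})^{n-j-i} \Big)$
        if min > Y' + Z then
            $\Gamma(u_j) = C$
            min ← Y' + Z
        end
    end
end

Algorithm 1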
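
Algorithm 1 can be prototyped on toy parameters to see the method of conditional expectation at work. The sketch below is not the authors' implementation. It makes the following simplifications, all of which are assumptions of this example: it evaluates the full conditional expectation of Y rather than the partial sum Y' (the minimizing choice is the same, since the omitted terms do not depend on the current candidate), it counts the admissible sets S' with a binomial coefficient instead of enumerating them, and it uses Python's `min` instead of the explicit `min` variable. It is only practical for very small n, m, and k.

```python
from itertools import combinations
from math import comb, log, sqrt

def expected_Y(fixed, n, m, c, k):
    """E[Y | first len(fixed) neighbourhoods], summed over all pairs (S, S')."""
    t, total = len(fixed), 0.0
    for i in range(c + 1, k + 1):
        p = comb(i - 1, c) / comb(m, c)
        for S in combinations(range(n), i):
            W = [u for u in S if u < t]               # fixed vertices of S
            gamma_W = frozenset().union(*(fixed[u] for u in W))
            if len(gamma_W) > i - 1:
                continue                              # no S' can contain Gamma(W)
            # number of (i-1)-subsets S' of the servers containing Gamma(W)
            count = comb(m - len(gamma_W), i - 1 - len(gamma_W))
            total += count * p ** (i - len(W))
    return total

def expected_Z(fixed, n, m, c, alpha, beta):
    """E[Z | first len(fixed) neighbourhoods]: binomial tails over servers."""
    r, p, total = n - len(fixed), c / m, 0.0
    for v in range(m):
        l = sum(v in nb for nb in fixed)
        total += sum(comb(r, i) * p**i * (1 - p)**(r - i)
                     for i in range(r + 1)
                     if l + i < alpha or l + i > beta)
    return total

def derandomized_cbc(n, m, c, k):
    dev = sqrt(n / 2 * log(4 * m))
    alpha, beta = n * c / m - dev, n * c / m + dev
    fixed = []
    for _ in range(n):
        def cost(C):
            trial = fixed + [frozenset(C)]
            return (expected_Y(trial, n, m, c, k)
                    + expected_Z(trial, n, m, c, alpha, beta))
        fixed.append(frozenset(min(combinations(range(m), c), key=cost)))
    return fixed

graph = derandomized_cbc(n=4, m=5, c=2, k=3)
print([sorted(nb) for nb in graph])
```

For these toy parameters the unconditional expectation of Y + Z is well below 1, so the greedy choices are guaranteed to end with no violated condition. The allowed degree interval is very wide for such small n, so the output need not look regular.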










Proof of correctness of the algorithm. At the beginning of the j-th iteration,
$\Gamma(u_1) = C_1, \ldots, \Gamma(u_{j-1}) = C_{j-1}$ are fixed and the algorithm selects $C = C_j$ which
minimizes $Y' + Z$ for given $\Gamma(u_1) = C_1, \ldots, \Gamma(u_{j-1}) = C_{j-1}$, $\Gamma(u_j) = C$. Note that
in (5), $E[Y_{S,S'} \mid \Gamma(u_1) = C_1, \ldots, \Gamma(u_t) = C_t]$ is independent of the particular choice of
$C_i$ if $u_i \notin S$. So, in the j-th iteration of the algorithm, while computing $Y'$, only those
summands $E[Y_{S,S'} \mid \Gamma(u_1) = C_1, \ldots, \Gamma(u_j) = C]$ are considered for which $u_j \in S$.
Hence, $Y' \le E[Y \mid \Gamma(u_1) = C_1, \ldots, \Gamma(u_j) = C]$, and the particular choice of $C = C_j$
which minimizes $Y' + Z$ for given $\Gamma(u_1) = C_1, \ldots, \Gamma(u_{j-1}) = C_{j-1}$, $\Gamma(u_j) = C$ also
minimizes $E[Y + Z \mid \Gamma(u_1) = C_1, \ldots, \Gamma(u_{j-1}) = C_{j-1}, \Gamma(u_j) = C]$; by (8), this also
justifies setting min to 1 at the beginning of the j-th iteration for $1 \le j \le n$. Hence the
proof follows from the discussion preceding Algorithm 1.

Runtime of the algorithm. Now, we present a coarse analysis of the algorithm
which is sufficient to indicate that the algorithm runs in time poly(m). For that, we
first estimate the time required by the algorithm to compute $Y'$^6. Note that the time
required to compute $\binom{m}{c}$ and $\binom{i-1}{c}$ is $O(m^{[\ldots]})$ (through dynamic programming); the
exponentiation takes time $O(\log k)$, and these operations are done $O(k n^{k-1} m^{k-1})$
times to get the summation. So, the time required by the algorithm to compute $Y'$
is [...]. Similarly, in the case of computing Z, the binomial coefficients
$\binom{n-j}{i}$ take time $O(n^{[\ldots]})$ to be evaluated, and exponentiations take time $O(\log n)$. So,
the overall time requirement in this case is $O(m n^{[\ldots]} \log n) = O(m^{[\ldots]} \log m)$. The
above two steps are done $O(n m^{c}) = [\ldots]$ times. So, the overall complexity of the
algorithm is [...].

3 Concluding Remarks 

Limitations of the construction. It can be observed that the algorithm crucially
depends on the fact that k is a constant, and this limits its applicability to the wider
setting where k is allowed to vary. Apart from being globally explicit, which, as
discussed in the beginning, is a weaker notion of explicitness, the construction is on
the slower side (even in terms of the number of edges, which is $O(m^{c})$), as indicated
by the above analysis. One possible approach to speeding up the algorithm is
discussed below.

Towards derandomization in NC. It can be observed from the first part of the proof of
Theorem 2 that the construction is in RNC, i.e., the construction can be carried out on a
probabilistic Parallel Random Access Machine (PRAM) (see [MR95] for the definition)
with poly(m) many processors in constant time^7. It is naturally interesting to
investigate NC-derandomization of problems in RNC, i.e., whether the same problem can
be solved using a deterministic PRAM under the same set of restrictions on resources.
Two of the most commonly used techniques employed for such derandomization are
the method of conditional expectation and the method of small sample spaces (see
[AS92]); sometimes they are used together [BR91, MNN94].

While the method of conditional expectation performs a binary search (or more
commonly a d-ary search) on the sample space for a good point, the method of small
sample spaces takes advantage of the small independence requirement of the random
variables involved in the algorithm and constructs a small sized (polynomial in the number
of variables) sample space which ensures such independence, and searches the sample
space for a good sample point. Since the sample space is polynomial sized, the search
can be done in polynomial time, and hence leads to a polynomial time construction.

In the proof of Theorem 2 we used independence twice; in (1) we used k-wise 
independence and in (3) we used Hoeffding’s inequality (Theorem 3) which requires 

^6 We consider the RAM model (see [MR95]), so addition, multiplication, and division are
atomic operations.

^7 In fact, it can be seen that the construction is in ZNC with the expected number of
iterations at most 4.
the involved random variables $X_1, \ldots, X_n$ to be n-wise independent^8. However, such
independence comes at the cost of a large sample space. More precisely, in [ABI86]
it was shown that in order to ensure k-wise independence among n random variables the
sample space size has to be $\Omega(n^{k/2})$. So, in the case of Theorem 2 the requirement on the
size of the sample space is huge ($\Omega(m^{[\ldots]})$). However, we want to point out here that the
requirement of n-wise independence in Theorem 2 can be brought down to $O(\ln(m))$-wise
independence with the help of the following limited independence Chernoff bound
from [BR94]. First we state the bound.

Theorem 4. ([BR94]) Let $t \ge 4$ be an even integer. Suppose $X_1, \ldots, X_n$ are t-wise
independent random variables taking values in [0, 1]. Let $X = X_1 + \cdots + X_n$, and
$a > 0$. Then

$$\Pr\{|X - E[X]| > a\} \le C_t \left(\frac{nt}{a^2}\right)^{t/2},$$

where $C_t$ is a constant depending on t, and $C_t < 1$ for $t \ge 6$.

Now, in inequality (3), we need $\frac{1}{2m}$ in the r.h.s. This can be achieved by setting
$t = 2\ln(2m)$ in the above theorem (for simplicity we assume $2\ln(2m)$ to be even),
and $a = \sqrt{2en\ln(2m)}$, since then $\left(\frac{nt}{a^2}\right)^{t/2} = e^{-\ln(2m)} = \frac{1}{2m}$. Hence, $O(\ln(m))$-wise
independence in choosing $\Gamma(u)$ for $u \in \mathcal{L}$ is sufficient for the randomized construction
(with a somewhat inferior bound on the deviation of the degrees from the average).

In [BR91, MNN94], the authors developed frameworks for NC-derandomization
of certain algorithms (notably, the set discrepancy problem of Spencer [Spe94]) that
require $O(\log^{c}(n))$-wise independence. One of the vital points of these frameworks is the
parallel computation of relevant conditional expectations for limited independence
random variables in logarithmic time. In the case of Algorithm 1, this means
computation of $Y'$ and Z by poly(m) processors in polylog(m) time under $O(\ln(m))$-wise
independence among the random choices of $\Gamma(u)$ for $u \in \mathcal{L}$. At present it is not clear
to us how this can be achieved in the frameworks of [BR91, MNN94], and it seems to
require more specialized techniques.